
Genomes Project. In addition to variant annotation with respect to genes, ANNOVAR

has the ability to perform annotation based on genomic region and to compare variants

to existing variation databases. In general, the types of annotation available in ANNOVAR can

be grouped into the following: (i) gene-based annotation, which identifies the effects of

variants on proteins; (ii) region-based annotation, which identifies the affected region (e.g.,

a conserved region); and (iii) filter-based annotation, which identifies variants reported in a specific

database such as dbSNP, ExAC, the 1000 Genomes Project, or gnomAD. Filter-based

annotation can also report prediction scores, including SIFT, PolyPhen, LRT, MutationTaster,

MutationAssessor, FATHMM, MetaSVM, and MetaLR.
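
As an illustration of how the three annotation types combine in practice, the following Python sketch assembles a command line for ANNOVAR's table_annovar.pl script, where each database named in -protocol is paired with an operation code in -operation (g for gene-based, r for region-based, f for filter-based). The file names, output prefix, genome build, and the particular database versions chosen here are assumptions for the example, and the helper function is hypothetical. The sketch only builds and prints the command; it does not run ANNOVAR.

```python
from typing import List, Tuple

# Operation codes that table_annovar.pl expects for each annotation type.
OPERATION_CODES = {"gene": "g", "region": "r", "filter": "f"}


def build_table_annovar_command(
    input_file: str,
    database_dir: str,
    build: str,
    out_prefix: str,
    protocols: List[Tuple[str, str]],
) -> List[str]:
    """Return an argument list for table_annovar.pl (hypothetical helper).

    `protocols` is a list of (database_name, annotation_type) pairs, where
    annotation_type is one of "gene", "region", or "filter".
    """
    names = [name for name, _ in protocols]
    operations = [OPERATION_CODES[kind] for _, kind in protocols]
    return [
        "table_annovar.pl", input_file, database_dir,
        "-buildver", build,
        "-out", out_prefix,
        "-remove",
        "-protocol", ",".join(names),
        "-operation", ",".join(operations),
        "-nastring", ".",
    ]


# Assumed example: one gene-based, one region-based, two filter-based databases.
example_protocols = [
    ("refGene", "gene"),
    ("cytoBand", "region"),
    ("avsnp150", "filter"),
    ("dbnsfp30a", "filter"),
]

command = build_table_annovar_command(
    "sample.avinput", "humandb/", "hg19", "sample_anno", example_protocols
)
print(" ".join(command))
```

The order of the entries in -protocol and -operation must match, which is why the sketch derives both lists from the same sequence of pairs.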

ANNOVAR relies on annotation databases to perform these types of annotation.

The annotation databases are built from the organism's annotation file in GFF3 format.
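
To make the GFF3 input more concrete, the sketch below parses a single GFF3 feature line into its nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes). The example record and the function name are invented for illustration; this is not part of ANNOVAR and does not reproduce how ANNOVAR builds its databases.

```python
from urllib.parse import unquote

# The nine columns defined by the GFF3 format, in order.
GFF3_COLUMNS = [
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
]


def parse_gff3_line(line: str) -> dict:
    """Parse one GFF3 feature line into a dictionary (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GFF3 feature line must have nine tab-separated columns")
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based and inclusive.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9 holds semicolon-separated tag=value pairs with URL-style escapes.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = unquote(value)
    record["attributes"] = attributes
    return record


# Made-up gene record used only to exercise the parser.
example_line = "chr1\texample\tgene\t1000\t5000\t.\t+\t.\tID=gene0001;Name=GENE1"
gene = parse_gff3_line(example_line)
print(gene["type"], gene["start"], gene["end"], gene["attributes"]["ID"])
```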

FIGURE 4.15  Number of variant effects by type and region.

FIGURE 4.16  Bar chart showing the percentage of variants by region.